Load libraries - see what is needed as I go
library(FlowSOM)
Loading required package: igraph
Warning: package ‘igraph’ was built under R version 4.1.2
Attaching package: ‘igraph’
The following objects are masked from ‘package:stats’:
decompose, spectrum
The following object is masked from ‘package:base’:
union
Registered S3 method overwritten by 'data.table':
method from
print.data.table
Thanks for using FlowSOM. From version 2.1.4 on, the scale
parameter in the FlowSOM function defaults to FALSE
#library(flowCore)
#library(cluster)
#library(fpc)
#library(clv)
library(Seurat)
Warning: package ‘Seurat’ was built under R version 4.1.2
Registered S3 method overwritten by 'htmlwidgets':
method from
print.htmlwidget tools:rstudio
Registered S3 method overwritten by 'spatstat.geom':
method from
print.boxx cli
Attaching SeuratObject
library(dplyr)
Attaching package: ‘dplyr’
The following objects are masked from ‘package:igraph’:
as_data_frame, groups, union
The following objects are masked from ‘package:stats’:
filter, lag
The following objects are masked from ‘package:base’:
intersect, setdiff, setequal, union
library(Rphenograph)
Loading required package: ggplot2
rm(list=ls())
Code from website to test installation and to figure out how the inputs work.
membership(Rphenograph_out[[2]])
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2
57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 2 3 3 3 3 2 3 3 3 3 3
113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140
3 2 3 3 3 3 3 2 3 2 3 2 3 3 2 2 3 3 3 3 3 2 3 3 3 3 2 3
141 142 143 144 145 146 147 148 149
3 3 3 3 3 2 3 3 2
The phenograph example works - Apply to the flow data
# read in the data and create an expression matrix
#input file path, change if needed
fileName <-"/Users/rhalenathomas/Documents/Data/FlowCytometry/PhenoID/Analysis/9MBO/prepro_outsjan20-9000cells/prepro_outsflowset.csv"
# note: the sample IDs in the current matrix have the cell index number attached.
df <- read.csv(fileName)
head(df)
print(dim(df)) # this specific df has 73578 cells
[1] 73578 19
# the preprocessing output csv needs to be cleaned - it contains live dead, FSC, SSC and the sample column
df2 <- df %>% select(-c(Live.Dead, FSC, SSC, X, Batch, cell))
m <- as.matrix(df2) # make a matrix as input to phenograph
unique(df$phenograph_cluster)
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
Levels: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
Try with a higher k
# The Rphenograph package seems to expose a single function, Rphenograph(); k sets the number of nearest neighbours
Rphenograph_out_flow <- Rphenograph(m, k = 271)
Run Rphenograph starts:
-Input data of 73578 rows and 13 columns
-k is set to 271
Finding nearest neighbors...DONE ~ 25.11 s
Compute jaccard coefficient between nearest-neighbor sets...DONE ~ 1747.534 s
Build undirected graph from the weighted links...DONE ~ 101.951 s
Run louvain clustering on the graph ...DONE ~ 87.838 s
Run Rphenograph DONE, totally takes 1962.433s.
Return a community class
-Modularity value: 0.825605
-Number of clusters: 22
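The log above shows that most of the run time goes into the Jaccard step, so here is a minimal base-R sketch of what that step computes, as I understand it: for two cells, the overlap between their sets of k nearest neighbours. The toy matrix `x`, the value `k = 2` and the helper `jaccard()` are made up for illustration and are not part of Rphenograph.

```r
# Toy data: 6 points in 2-D forming two tight groups (hypothetical values, not the flow data)
x <- rbind(c(0, 0), c(0, 1), c(1, 0), c(10, 10), c(10, 11), c(11, 10))
k <- 2

# Indices of the k nearest neighbours of each row (Euclidean distance, excluding the row itself)
d <- as.matrix(dist(x))
knn <- t(apply(d, 1, function(row) order(row)[2:(k + 1)]))

# Jaccard coefficient between the neighbour sets of rows i and j.
# base:: is spelled out because igraph and dplyr both mask intersect/union.
jaccard <- function(i, j) {
  shared <- base::intersect(knn[i, ], knn[j, ])
  either <- base::union(knn[i, ], knn[j, ])
  length(shared) / length(either)
}

jaccard(1, 2)  # two points from the same group share a neighbour
jaccard(1, 4)  # points from different groups share none
```

With a larger k each neighbour set is bigger, so there are more pairs to compare, which fits the long Jaccard time seen at k = 271.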
modularity(Rphenograph_out_flow[[2]])
[1] 0.825605
membership(Rphenograph_out_flow[[2]])
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
1 2 2 2 3 4 5 4 6 3 7 8 9 2 10 11 1 12 2 13 12 5
23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44
13 14 12 15 9 3 3 3 15 3 8 3 15 9 3 3 3 8 11 1 9 11
45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66
9 13 9 15 9 13 5 7 16 13 3 17 16 4 2 5 15 2 15 5 16 1
67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88
9 9 13 9 9 18 13 5 8 5 6 18 6 7 2 1 1 7 9 13 13 3
89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110
8 17 8 8 3 9 13 13 2 17 19 7 15 9 9 3 8 3 5 15 5 13
111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132
8 13 16 2 3 1 16 13 4 13 9 4 9 8 13 3 18 8 4 15 1 4
133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154
2 18 15 9 17 1 3 16 1 1 2 13 13 13 3 8 2 16 14 9 16 3
155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176
13 16 13 12 2 13 11 8 13 18 4 8 1 8 7 16 4 13 9 5 15 9
177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198
13 15 3 3 9 3 15 8 4 6 8 3 15 2 13 11 15 20 5 5 17 18
199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220
8 15 3 9 18 3 7 5 2 3 8 13 21 8 1 3 9 17 4 3 13 8
221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242
6 3 9 3 9 18 8 18 1 8 13 13 7 13 2 1 1 7 1 9 5 8
243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264
13 5 8 16 5 15 1 13 1 9 1 8 3 6 15 20 8 13 3 17 15 9
265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286
13 3 13 15 9 9 2 13 3 8 9 15 13 7 7 13 15 15 9 4 8 15
287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308
3 2 3 9 5 8 9 3 9 8 1 6 7 3 7 8 5 18 8 6 13 17
309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330
1 9 9 13 13 8 9 13 9 15 11 9 16 7 5 0.826 4 16 7 3 9 9
331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352
4 3 6 16 17 13 15 8 2 21 19 16 13 13 13 13 4 3 17 5 2 9
353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374
12 14 10 2 3 13 8 8 7 13 20 13 3 10 3 13 7 1 13 10 13 7
375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396
13 15 3 3 16 1 11 9 17 1 16 9 9 15 2 13 21 13 9 16 15 15
397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418
15 2 2 6 1 13 11 8 8 1 1 2 13 10 9 2 3 13 18 2 13 15
419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440
3 7 13 13 3 7 13 21 18 13 8 13 2 7 2 7 16 13 9 4 16 12
441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462
20 13 1 3 16 15 13 7 16 3 8 9 7 8 8 13 18 16 16 8 7 20
463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484
9 1 3 8 8 7 8 5 6 10 7 15 5 5 13 1 1 1 3 11 13 5
485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506
9 8 1 8 15 17 8 8 3 13 8 16 17 3 18 2 9 18 1 13 4 8
507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528
7 8 9 9 13 15 2 8 1 18 1 17 7 1 10 20 18 8 7 4 16 15
529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550
3 15 14 13 3 3 4 8 13 15 8 2 18 13 5 13 8 15 16 1 11 9
551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572
15 13 9 1 17 8 8 6 1 13 4 2 8 15 3 15 5 15 17 2 16 11
573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594
8 15 7 7 13 1 5 1 13 11 13 18 1 7 13 9 11 8 1 3 21 16
595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616
2 15 8 13 18 9 1 1 2 4 15 8 5 3 1 16 7 5 13 16 15 13
617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638
4 1 3 8 3 3 13 9 8 2 10 8 13 1 17 4 15 15 8 8 1 15
639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660
6 3 9 5 8 9 1 17 3 8 5 15 13 13 4 9 13 13 13 4 5 11
661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682
9 2 16 8 13 13 9 3 9 3 15 2 9 15 8 3 1 13 8 8 17 5
683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704
1 5 13 16 18 4 16 1 16 11 16 13 9 13 7 3 13 17 7 15 2 6
705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726
10 7 5 8 4 8 11 8 13 8 18 1 8 1 3 6 15 1 3 8 11 13
727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748
13 8 13 18 8 13 6 13 2 7 1 15 5 17 8 5 7 9 8 14 3 3
749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770
3 17 13 5 5 17 8 8 2 15 16 5 9 4 15 17 1 8 15 15 8 17
771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792
8 5 15 13 15 9 13 1 2 3 20 8 3 8 16 5 6 13 2 14 13 20
793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814
1 4 15 5 20 2 3 8 8 18 9 20 8 6 8 16 6 1 15 9 5 7
815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836
3 3 8 3 9 20 5 18 3 3 1 9 13 14 7 5 9 2 17 4 3 11
837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858
8 7 5 12 1 15 15 11 8 15 1 7 9 8 4 9 3 3 20 5 2 5
859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880
15 1 13 7 16 11 13 2 7 4 9 10 21 3 13 15 1 7 9 13 2 1
881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902
13 2 8 3 1 13 13 19 15 13 9 1 2 5 13 13 13 9 9 3 20 3
903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924
15 8 7 8 15 13 13 10 3 4 13 8 8 6 18 11 13 1 13 18 4 2
925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946
7 8 3 3 16 21 11 9 3 1 6 8 11 20 5 9 15 11 4 3 4 13
947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968
4 8 8 3 3 18 13 1 1 10 18 8 9 8 1 18 13 8 8 9 13 15
969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990
15 21 8 3 13 17 14 5 8 13 8 10 18 7 17 7 3 4 13 21 14 1
991 992 993 994 995 996 997 998 999 1000
14 2 13 5 9 8 2 1 7 9
[ reached getOption("max.print") -- omitted 72578 entries ]
# add cluster ID back into original df
df$phenograph_clusterk271 <- factor(membership(Rphenograph_out_flow[[2]]))
# how many clusters are there?
unique(df$phenograph_cluster)
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
Levels: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
# 30 levels! Way too many - because the k is low.
#ggplot(iris_unique, aes(x=Sepal.Length, y=Sepal.Width, col=Species, shape=phenograph_cluster)) + geom_point(size = 3)+theme_bw()
# how many clusters are there?
unique(df$phenograph_clusterk271)
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
Levels: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
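`unique()` only lists the cluster labels. To judge whether there are too many clusters it also helps to see how many cells fall in each one. A small sketch with made-up labels standing in for the `membership()` output (the vector `labels` is hypothetical):

```r
# Hypothetical cluster labels, standing in for factor(membership(...))
labels <- factor(c(1, 1, 2, 3, 3, 3, 2, 1, 3))

n_clusters <- nlevels(labels)  # number of distinct clusters
sizes <- table(labels)         # number of cells per cluster
smallest <- min(sizes)         # size of the smallest cluster

n_clusters
sizes
```

On the real data the same two calls on `df$phenograph_clusterk271` would show whether some of the 22 clusters contain only a handful of cells.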
# I'll have a quick look at the clustering using two antibodies (AB)
ggplot(df, aes(x=CD44, y=CD71, col = phenograph_clusterk271)) + geom_point(size = 1)+theme_bw()
Save the df with the phenograph cluster indexes to save time.
write.csv(df,"/Users/rhalenathomas/Documents/Data/FlowCytometry/PhenoID/Analysis/9MBO/prepro_outsjan20-9000cells/prepro_outsflowsetdf+phenographk50k271.csv")
# Seurat object made from the same input as the phenograph clustering
seu <- readRDS("/Users/rhalenathomas/Documents/Data/FlowCytometry/PhenoID/Analysis/9MBO/prepro_outsjan20-9000cells/SeuratfromFlowsom.rds")
# read in the df with the phenograph clustering
df <- read.csv("/Users/rhalenathomas/Documents/Data/FlowCytometry/PhenoID/Analysis/9MBO/prepro_outsjan20-9000cells/prepro_outsflowsetdf+phenographk50k271.csv")
# add the phenograph cluster indexes
seu <- AddMetaData(object=seu, metadata=df$phenograph_clusterk271, col.name = 'Phenograph.k.271')
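One thing to keep in mind with the write.csv / read.csv round trip used here: by default `write.csv` also writes the row names, which come back as an extra column named `X`, and factor columns come back as plain integers. A minimal sketch on a made-up data frame (the names `toy`, `path_default` and `path_clean` are only for illustration):

```r
# Small stand-in for the flow data frame (hypothetical values)
toy <- data.frame(CD44 = c(0.1, 0.5, 0.9), cluster = factor(c(1, 2, 2)))

path_default <- tempfile(fileext = ".csv")
path_clean <- tempfile(fileext = ".csv")

write.csv(toy, path_default)                   # row names are written as a first, unnamed column
write.csv(toy, path_clean, row.names = FALSE)  # row names are left out

back_default <- read.csv(path_default)
back_clean <- read.csv(path_clean)

names(back_default)        # has an extra "X" column
names(back_clean)          # same columns as toy
class(back_clean$cluster)  # the factor is read back as integer
```

This is probably where the `X` column dropped earlier with `select()` came from. If the cluster column is needed as a factor after reloading, it has to be converted again with `factor()`.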
See some plots
DimPlot(seu, reduction = "umap", repel = TRUE, label = TRUE, group.by = "Phenograph.k.271")
# make a list of AB from the input df - can only use df if all the extra columns are filtered out
print(colnames(df2))
[1] "AQP4" "CD56" "GLAST" "CD140a" "CD29" "CD44" "CD184" "CD71" "CD24" "CD15"
[11] "O4" "HepaCAM" "CD133"
allAB <- colnames(df2)
DoHeatmap(seu, features = allAB, group.by = "Phenograph.k.271")
DotPlot(seu, features = allAB, group.by = "Phenograph.k.271", cols = c("blue","red"))
DimPlot(seu, reduction = "umap", repel = TRUE, label = TRUE, group.by = "flowSOM.k.8")
DoHeatmap(seu, features = allAB, group.by = "flowSOM.k.8")
Save seurat object with Phenograph and FlowSOM.k.8 clusters
saveRDS(seu,"/Users/rhalenathomas/Documents/Data/FlowCytometry/PhenoID/Analysis/9MBO/prepro_outsjan20-9000cells/SeuratfromFlowsomPheno.rds" )
Try to optimize UMAP to get better separation of groups
spread.opt <- c(0.1,0.5,0.75,1,5)
a.opt <- c(830,5.07,2.51,1.58,0.14)
b.opt <- c(1.93,1.0,0.93,0.90,0.81)
for (i in 1:5){
seu <- RunUMAP(seu, dims = NULL, n.neighbors = 250, min.dist = 0.1 ,features = allAB, slot = 'scale.data', spread = spread.opt[i], a = a.opt[i], b = b.opt[i])
print(DimPlot(seu, reduction = "umap", repel = TRUE, label = TRUE, group.by = "Phenograph.k.271"))
print(DimPlot(seu, reduction = "umap", repel = TRUE, label = TRUE, group.by = "RNA_snn_res.1"))
}
16:11:05 Read 73578 rows and found 13 numeric columns
16:11:05 Using Annoy for neighbor search, n_neighbors = 250
16:11:05 Building Annoy index with metric = cosine, n_trees = 50
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
16:11:11 Writing NN index file to temp file /var/folders/k4/khtkczkd5tn732ftjpwgtr240000gn/T//RtmpFTK9OU/file17da11ca0b4c2
16:11:11 Searching Annoy index using 1 thread, search_k = 25000
16:13:27 Annoy recall = 100%
16:13:28 Commencing smooth kNN distance calibration using 1 thread
16:13:55 Initializing from normalized Laplacian + noise
16:14:50 Commencing optimization for 200 epochs, with 14776612 positive edges
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
16:16:11 Optimization finished
16:16:15 Read 73578 rows and found 13 numeric columns
16:16:15 Using Annoy for neighbor search, n_neighbors = 250
16:16:15 Building Annoy index with metric = cosine, n_trees = 50
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
16:16:20 Writing NN index file to temp file /var/folders/k4/khtkczkd5tn732ftjpwgtr240000gn/T//RtmpFTK9OU/file17da17abb3b3c
16:16:20 Searching Annoy index using 1 thread, search_k = 25000
16:18:36 Annoy recall = 100%
16:18:36 Commencing smooth kNN distance calibration using 1 thread
16:19:04 Initializing from normalized Laplacian + noise
16:19:58 Commencing optimization for 200 epochs, with 14776612 positive edges
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
16:21:19 Optimization finished
16:21:22 Read 73578 rows and found 13 numeric columns
16:21:22 Using Annoy for neighbor search, n_neighbors = 250
16:21:22 Building Annoy index with metric = cosine, n_trees = 50
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
16:21:27 Writing NN index file to temp file /var/folders/k4/khtkczkd5tn732ftjpwgtr240000gn/T//RtmpFTK9OU/file17da11a2e1f17
16:21:27 Searching Annoy index using 1 thread, search_k = 25000
16:28:56 Annoy recall = 100%
16:28:57 Commencing smooth kNN distance calibration using 1 thread
16:29:24 Initializing from normalized Laplacian + noise
16:30:19 Commencing optimization for 200 epochs, with 14776612 positive edges
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
16:31:40 Optimization finished
16:31:43 Read 73578 rows and found 13 numeric columns
16:31:43 Using Annoy for neighbor search, n_neighbors = 250
16:31:43 Building Annoy index with metric = cosine, n_trees = 50
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
16:31:48 Writing NN index file to temp file /var/folders/k4/khtkczkd5tn732ftjpwgtr240000gn/T//RtmpFTK9OU/file17da149fb2c6e
16:31:48 Searching Annoy index using 1 thread, search_k = 25000
16:34:06 Annoy recall = 100%
16:34:07 Commencing smooth kNN distance calibration using 1 thread
16:34:36 Initializing from normalized Laplacian + noise
16:35:32 Commencing optimization for 200 epochs, with 14776612 positive edges
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
16:36:53 Optimization finished
16:36:57 Read 73578 rows and found 13 numeric columns
16:36:57 Using Annoy for neighbor search, n_neighbors = 250
16:36:57 Building Annoy index with metric = cosine, n_trees = 50
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
16:37:02 Writing NN index file to temp file /var/folders/k4/khtkczkd5tn732ftjpwgtr240000gn/T//RtmpFTK9OU/file17da19220fb4
16:37:02 Searching Annoy index using 1 thread, search_k = 25000
16:39:21 Annoy recall = 100%
16:39:21 Commencing smooth kNN distance calibration using 1 thread
16:39:49 Initializing from normalized Laplacian + noise
16:40:44 Commencing optimization for 200 epochs, with 14776612 positive edges
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
16:42:06 Optimization finished
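The `a` and `b` values paired with each `spread` in the loop above look like the output of the curve fit UMAP normally does internally: fit `1 / (1 + a * x^(2b))` to a target that is 1 up to `min.dist` and then decays exponentially at a rate set by `spread`. Below is a rough base-R sketch of that fit, assuming this convention (it is how I understand the reference UMAP implementation to work). The function name `fit_ab` and the use of `optim` are my own choices, not something Seurat provides.

```r
# Least-squares fit of UMAP's a and b for a given spread and min_dist.
# Sketch only: fit_ab is a hypothetical helper, not a Seurat or uwot function.
fit_ab <- function(spread, min_dist) {
  x <- seq(0, 3 * spread, length.out = 300)
  # Target curve: flat at 1 below min_dist, exponential decay beyond it
  target <- ifelse(x < min_dist, 1, exp(-(x - min_dist) / spread))
  # Sum of squared errors between the a/b curve and the target
  sse <- function(p) sum((1 / (1 + p[1] * x^(2 * p[2])) - target)^2)
  fit <- optim(c(1, 1), sse, control = list(reltol = 1e-10, maxit = 2000))
  c(a = fit$par[1], b = fit$par[2])
}

ab <- fit_ab(spread = 1, min_dist = 0.1)
ab
```

For `spread = 1` and `min_dist = 0.1` the result should land close to the 1.58 and 0.90 used in `a.opt` and `b.opt`. Note that `a` and `b` depend on `min.dist` as well as `spread`, so changing `min.dist` while reusing the same `a`/`b` pairs means the explicitly supplied curve no longer matches the `min.dist` argument.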
resolutions = c("RNA_snn_res.0.1","RNA_snn_res.0.25","RNA_snn_res.0.5","RNA_snn_res.0.75","RNA_snn_res.1")
for (res in resolutions){
print(DimPlot(seu, reduction = "umap", repel = TRUE, label = TRUE, group.by = res))
}
# run at a higher spread
seu2 <- seu
seu2 <- RunUMAP(seu2, dims = NULL, n.neighbors = 250, min.dist = 0.01 ,features = allAB, slot = 'scale.data', spread =10, a = 0.05, b = 0.8)
16:50:47 Read 73578 rows and found 13 numeric columns
16:50:47 Using Annoy for neighbor search, n_neighbors = 250
16:50:47 Building Annoy index with metric = cosine, n_trees = 50
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
16:50:52 Writing NN index file to temp file /var/folders/k4/khtkczkd5tn732ftjpwgtr240000gn/T//RtmpFTK9OU/file17da152d749e1
16:50:52 Searching Annoy index using 1 thread, search_k = 25000
16:53:06 Annoy recall = 100%
16:53:07 Commencing smooth kNN distance calibration using 1 thread
16:53:35 Initializing from normalized Laplacian + noise
16:54:30 Commencing optimization for 200 epochs, with 14776612 positive edges
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
16:55:51 Optimization finished
resolutions = c("RNA_snn_res.0.1","RNA_snn_res.0.25","RNA_snn_res.0.5","RNA_snn_res.0.75","RNA_snn_res.1")
for (res in resolutions){
print(DimPlot(seu2, reduction = "umap", repel = TRUE, label = TRUE, group.by = res))
}
DimPlot(seu2, reduction = "umap", repel = TRUE, label = TRUE, group.by = "Phenograph.k.271")
seu3 <- seu
spread.opt <- c(2,3,5)
a.opt <- c(0.54,0.3,0.14)
b.opt <- c(0.84,0.82,0.81)
for (i in seq_along(spread.opt)){ # spread.opt has 3 entries; 1:5 would pass NA for runs 4 and 5
seu <- RunUMAP(seu, dims = NULL, n.neighbors = 250, min.dist = 0.001 ,features = allAB, slot = 'scale.data', spread = spread.opt[i], a = a.opt[i], b = b.opt[i])
print(DimPlot(seu, reduction = "umap", repel = TRUE, label = TRUE, group.by = "Phenograph.k.271"))
print(DimPlot(seu, reduction = "umap", repel = TRUE, label = TRUE, group.by = "RNA_snn_res.1"))
}
17:05:04 Read 73578 rows and found 13 numeric columns
17:05:04 Using Annoy for neighbor search, n_neighbors = 250
17:05:04 Building Annoy index with metric = cosine, n_trees = 50
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
17:05:09 Writing NN index file to temp file /var/folders/k4/khtkczkd5tn732ftjpwgtr240000gn/T//RtmpFTK9OU/file17da134337d44
17:05:09 Searching Annoy index using 1 thread, search_k = 25000
17:07:23 Annoy recall = 100%
17:07:24 Commencing smooth kNN distance calibration using 1 thread
17:07:51 Initializing from normalized Laplacian + noise
17:08:47 Commencing optimization for 200 epochs, with 14776612 positive edges
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
17:10:08 Optimization finished
17:10:11 Read 73578 rows and found 13 numeric columns
17:10:11 Using Annoy for neighbor search, n_neighbors = 250
17:10:11 Building Annoy index with metric = cosine, n_trees = 50
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
17:10:16 Writing NN index file to temp file /var/folders/k4/khtkczkd5tn732ftjpwgtr240000gn/T//RtmpFTK9OU/file17da120651622
17:10:16 Searching Annoy index using 1 thread, search_k = 25000
17:12:29 Annoy recall = 100%
17:12:29 Commencing smooth kNN distance calibration using 1 thread
17:12:57 Initializing from normalized Laplacian + noise
17:13:51 Commencing optimization for 200 epochs, with 14776612 positive edges
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
17:15:12 Optimization finished
17:15:15 Read 73578 rows and found 13 numeric columns
17:15:15 Using Annoy for neighbor search, n_neighbors = 250
17:15:15 Building Annoy index with metric = cosine, n_trees = 50
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
17:15:21 Writing NN index file to temp file /var/folders/k4/khtkczkd5tn732ftjpwgtr240000gn/T//RtmpFTK9OU/file17da14c9022cb
17:15:21 Searching Annoy index using 1 thread, search_k = 25000
17:17:34 Annoy recall = 100%
17:17:34 Commencing smooth kNN distance calibration using 1 thread
17:18:02 Initializing from normalized Laplacian + noise
17:18:57 Commencing optimization for 200 epochs, with 14776612 positive edges
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
17:20:18 Optimization finished
17:20:22 Read 73578 rows and found 13 numeric columns
17:20:22 Using Annoy for neighbor search, n_neighbors = 250
17:20:22 Building Annoy index with metric = cosine, n_trees = 50
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
17:20:27 Writing NN index file to temp file /var/folders/k4/khtkczkd5tn732ftjpwgtr240000gn/T//RtmpFTK9OU/file17da1adc64b2
17:20:27 Searching Annoy index using 1 thread, search_k = 25000
17:22:42 Annoy recall = 100%
17:22:43 Commencing smooth kNN distance calibration using 1 thread
17:23:11 Initializing from normalized Laplacian + noise
17:24:06 Commencing optimization for 200 epochs, with 14776612 positive edges
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
17:24:48 Optimization finished
Warning: Removed 73578 rows containing missing values (geom_point).
Warning: Removed 22 rows containing missing values (geom_text_repel).
Warning: Removed 73578 rows containing missing values (geom_point).
Warning: Removed 15 rows containing missing values (geom_text_repel).
17:24:49 Read 73578 rows and found 13 numeric columns
17:24:49 Using Annoy for neighbor search, n_neighbors = 250
17:24:49 Building Annoy index with metric = cosine, n_trees = 50
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
17:24:54 Writing NN index file to temp file /var/folders/k4/khtkczkd5tn732ftjpwgtr240000gn/T//RtmpFTK9OU/file17da1b56e7b0
17:24:55 Searching Annoy index using 1 thread, search_k = 25000
17:27:17 Annoy recall = 100%
17:27:17 Commencing smooth kNN distance calibration using 1 thread
17:27:46 Initializing from normalized Laplacian + noise
17:28:41 Commencing optimization for 200 epochs, with 14776612 positive edges
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
17:29:23 Optimization finished
Warning: Removed 73578 rows containing missing values (geom_point).
Warning: Removed 22 rows containing missing values (geom_text_repel).
Warning: Removed 73578 rows containing missing values (geom_point).
Warning: Removed 15 rows containing missing values (geom_text_repel).
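The "Removed 73578 rows containing missing values" warnings are what I would expect when a loop index runs past the end of the parameter vectors: R returns `NA` instead of raising an error, the `NA` is passed on as `spread`, `a` and `b`, and every embedding coordinate ends up missing. A small sketch of that behaviour and of the safer `seq_along()` pattern (the vector `spread_opt` just mirrors the three values above):

```r
spread_opt <- c(2, 3, 5)

# Indexing past the end of a vector gives NA, not an error
beyond <- spread_opt[4]
beyond

# seq_along() only yields valid indices, whatever the vector length
visited <- integer(0)
for (i in seq_along(spread_opt)) {
  visited <- c(visited, i)
}
visited
```

Using `seq_along()` for every tuning loop keeps the loop bounds in step with the parameter vectors when values are added or removed.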
# spread 3
dist.opt = c(0.001,0.005,0.01,0.05)
for (ds in dist.opt){
seu <- RunUMAP(seu, dims = NULL, n.neighbors = 250, min.dist = ds ,features = allAB, slot = 'scale.data', spread = 3, a = 0.3, b = 0.82)
print(DimPlot(seu, reduction = "umap", repel = TRUE, label = TRUE, group.by = "Phenograph.k.271"))
print(DimPlot(seu, reduction = "umap", repel = TRUE, label = TRUE, group.by = "RNA_snn_res.1"))
}
10:29:35 Read 73578 rows and found 13 numeric columns
10:29:35 Using Annoy for neighbor search, n_neighbors = 250
10:29:35 Building Annoy index with metric = cosine, n_trees = 50
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
10:29:40 Writing NN index file to temp file /var/folders/k4/khtkczkd5tn732ftjpwgtr240000gn/T//RtmpFTK9OU/file17da1176ac32d
10:29:40 Searching Annoy index using 1 thread, search_k = 25000
10:31:56 Annoy recall = 100%
10:31:57 Commencing smooth kNN distance calibration using 1 thread
10:32:26 Initializing from normalized Laplacian + noise
10:33:22 Commencing optimization for 200 epochs, with 14776612 positive edges
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
10:34:42 Optimization finished
10:34:46 Read 73578 rows and found 13 numeric columns
10:34:46 Using Annoy for neighbor search, n_neighbors = 250
10:34:46 Building Annoy index with metric = cosine, n_trees = 50
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
10:34:51 Writing NN index file to temp file /var/folders/k4/khtkczkd5tn732ftjpwgtr240000gn/T//RtmpFTK9OU/file17da16233cb5d
10:34:51 Searching Annoy index using 1 thread, search_k = 25000
10:37:07 Annoy recall = 100%
10:37:08 Commencing smooth kNN distance calibration using 1 thread
10:37:36 Initializing from normalized Laplacian + noise
10:38:32 Commencing optimization for 200 epochs, with 14776612 positive edges
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
10:39:53 Optimization finished
10:40:25 Read 73578 rows and found 13 numeric columns
10:40:25 Using Annoy for neighbor search, n_neighbors = 250
10:40:25 Building Annoy index with metric = cosine, n_trees = 50
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
10:40:30 Writing NN index file to temp file /var/folders/k4/khtkczkd5tn732ftjpwgtr240000gn/T//RtmpFTK9OU/file17da1366c7909
10:40:30 Searching Annoy index using 1 thread, search_k = 25000
10:45:35 Annoy recall = 100%
10:45:36 Commencing smooth kNN distance calibration using 1 thread
10:46:05 Initializing from normalized Laplacian + noise
10:47:00 Commencing optimization for 200 epochs, with 14776612 positive edges
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
10:48:22 Optimization finished
10:48:25 Read 73578 rows and found 13 numeric columns
10:48:25 Using Annoy for neighbor search, n_neighbors = 250
10:48:25 Building Annoy index with metric = cosine, n_trees = 50
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
10:48:30 Writing NN index file to temp file /var/folders/k4/khtkczkd5tn732ftjpwgtr240000gn/T//RtmpFTK9OU/file17da1b7e59c9
10:48:31 Searching Annoy index using 1 thread, search_k = 25000
10:50:48 Annoy recall = 100%
10:50:49 Commencing smooth kNN distance calibration using 1 thread
10:51:17 Initializing from normalized Laplacian + noise
10:52:13 Commencing optimization for 200 epochs, with 14776612 positive edges
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
10:53:35 Optimization finished
Maybe tSNE works better with this data type than UMAP
#seu <- RunTSNE(seu, dims = 1:10)
DimPlot(seu, reduction = "tsne", repel = TRUE, label = TRUE, group.by = "RNA_snn_res.1")
DimPlot(seu, reduction = "tsne", repel = TRUE, label = TRUE, group.by = "Phenograph.k.271")
# the CHI (Calinski-Harabasz index) shows this is the best resolution
DimPlot(seu, reduction = "tsne", repel = TRUE, label = TRUE, group.by = "RNA_snn_res.0.1")
DimPlot(seu, reduction = "umap", repel = TRUE, label = TRUE, group.by = "RNA_snn_res.0.1")
DimPlot(seu, reduction = "pca", repel = TRUE, label = TRUE, group.by = "RNA_snn_res.0.1")
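The CHI calculation itself is not in this notebook. Assuming CHI means the Calinski-Harabasz index, a minimal base-R version would look like the sketch below: the between-cluster sum of squares over the within-cluster sum of squares, each divided by its degrees of freedom. The function `calinski_harabasz` and the toy data are my own illustration, not the code that produced the result mentioned above.

```r
# Calinski-Harabasz index: (between SS / (k - 1)) / (within SS / (n - k)).
# Hypothetical helper for illustration; higher values mean better separated clusters.
calinski_harabasz <- function(x, labels) {
  x <- as.matrix(x)
  labels <- as.factor(labels)
  n <- nrow(x)
  k <- nlevels(labels)
  overall <- colMeans(x)
  between <- 0
  within <- 0
  for (lev in levels(labels)) {
    rows <- x[labels == lev, , drop = FALSE]
    centre <- colMeans(rows)
    between <- between + nrow(rows) * sum((centre - overall)^2)
    within <- within + sum(sweep(rows, 2, centre)^2)
  }
  (between / (k - 1)) / (within / (n - k))
}

# Toy check: two well separated groups, scored with sensible and with scrambled labels
set.seed(42)
toy <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
             matrix(rnorm(40, mean = 6), ncol = 2))
good <- rep(1:2, each = 20)   # labels that follow the two groups
bad <- rep(1:2, times = 20)   # labels that alternate and ignore the groups

calinski_harabasz(toy, good) > calinski_harabasz(toy, bad)
```

On the real data this would be called once per resolution column, for example on the scaled antibody matrix with `seu$RNA_snn_res.0.1` as labels, and the resolution with the highest index would be preferred.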
library(clustree)
Loading required package: ggraph
clustree(seu, prefix = "RNA_snn_res.") + theme(legend.position = "bottom")
Warning: The `add` argument of `group_by()` is deprecated as of dplyr 1.0.0.
Please use the `.add` argument instead.
This warning is displayed once every 8 hours.
Call `lifecycle::last_lifecycle_warnings()` to see where this warning was generated.
DimPlot(seu, reduction = "umap", repel = TRUE, label = TRUE, group.by = "RNA_snn_res.1.75")
DoHeatmap(seu, features = allAB, group.by = "RNA_snn_res.1.75")
# save object with more cluster resolutions
saveRDS(seu,"/Users/rhalenathomas/Documents/Data/FlowCytometry/PhenoID/Analysis/9MBO/prepro_outsjan20-9000cells/SeuratfromFlowsomPheno.rds" )
Add cluster annotation from manual annotation for the highest resolution
saveRDS(seu,"/Users/rhalenathomas/Documents/Data/FlowCytometry/PhenoID/Analysis/9MBO/prepro_outsjan20-9000cells/SeuratfromFlowsomPhenoLabels.rds" )